1.  Introduction/Background


Stokes imagery forms a basis consisting of $\{S_0, S_1, S_2, S_3\}$, where $S_1^2 + S_2^2 + S_3^2 \le S_0^2$. ARL's specially calibrated FLIR camera captures the whole basis of data. From this basis, we specifically look at the linear polarimetric subset $\{S_0, S_1, S_2\}$, where $S_0$ is the incident radiance and $S_1$ and $S_2$ specify the state of polarization. Stokes images can be used to calculate a variety of other metrics. However, this study only uses DOLP (degree of linear polarization) and ORT (orientation angle), conventionally defined as

$$\mathrm{DOLP} = \frac{\sqrt{S_1^2 + S_2^2}}{S_0}, \qquad \mathrm{ORT} = \frac{1}{2}\arctan\!\left(\frac{S_2}{S_1}\right)$$

(1).


Cluster analysis groups pixels that have similar values, where similarity is measured using a distance metric. Since man-made objects emit radiation with a higher degree of polarization than complex natural backgrounds do, cluster analysis can help us separate these objects from their backgrounds. In particular, the cluster analysis highlights the regions that are more polarized and suppresses the less polarized regions (2). This is useful for separating objects such as tanks, cars, trucks, and buildings from the brush. The cluster algorithm is hierarchical and agglomerative, which means that every pixel in each image begins as an individual cluster and the two closest clusters fuse repeatedly until the only remaining cluster contains all pixels (2). Distances are recalculated each time a cluster gains new members. Average linkage was used to create the clusters, defining the distance between two clusters as the average distance between all pairs of members, one from each cluster (2).
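The merging procedure described above can be illustrated with a minimal Python sketch. It is not the ARL-TR-4216 implementation: the function name `average_linkage`, the Euclidean distance, and the six two-component feature vectors are assumptions chosen only to show how average linkage fuses the closest clusters. The loop stops at two clusters, the level examined in this study, rather than continuing to a single cluster.

```python
from itertools import combinations
from math import dist


def average_linkage(points, n_clusters=2):
    """Fuse the two closest clusters until n_clusters remain."""
    # Every point (pixel feature vector) starts as its own cluster.
    clusters = [[i] for i in range(len(points))]

    def linkage(a, b):
        # Average linkage: mean distance over all cross-cluster pairs.
        total = sum(dist(points[i], points[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_clusters:
        # Distances are recomputed after every merge.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: linkage(clusters[pair[0]], clusters[pair[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # safe because i < j
    return clusters


# Hypothetical (DOLP, S0) vectors: four background-like, two target-like.
pixels = [(0.01, 0.50), (0.02, 0.52), (0.01, 0.49), (0.02, 0.51),
          (0.30, 0.90), (0.32, 0.88)]
groups = average_linkage(pixels, n_clusters=2)
print(sorted(sorted(g) for g in groups))
```

This brute-force version recomputes every pairwise linkage at each step, which is fine for a handful of points but far too slow for a full image; it is meant only to make the definition concrete.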


The  goal  of  this  study  is  to  compare  the  results  of  cluster  analysis  when  all  sixteen  possible 
combinations  of  input  parameters  are  used  in  order  to  maximize  the  utility  of  the  algorithm.  The 
algorithm  is  outlined  in  ARL-TR-4216.  All  results  were  created  using  this  algorithm  run  on 
images  taken  by  BED.  This  paper  introduces  the  methodology  of  the  investigation  and  explores 
the  statistical  significance  of  the  results  in  an  effort  to  establish  the  differences  amongst  the 
variable  combinations,  with  a  hope  of  establishing  an  input  preference. 


2.  Experiment/Calculations 


The goal of this study is to compare the results of cluster analysis when all sixteen possible combinations of input parameters are used. To achieve this goal, multiple polarimetric images were explored in order to find a suitable subset for the investigation. The remainder of the study used only a single image. Once this image was selected, the algorithm was run over the subsection of this image containing the targets using all sixteen possible combinations of input parameters: DolpOrtS0, DolpOrtS1, DolpOrtS2, DolpS0S1, DolpS0S2, DolpS1S2, OrtS0S1, OrtS0S2, OrtS1S2, S0S1S2, DolpOrtS0S1, DolpOrtS0S2, DolpOrtS1S2, DolpS0S1S2, OrtS0S1S2, and DolpOrtS0S1S2. The output from the algorithm included cluster images that diagrammed the group location for each pixel during all iterations, as well as the number of pixels in each group and the mean values for each of these groups and their standard deviations. From these outputs, specific values were gathered and compiled onto an Excel™ chart for further evaluation. For this study, the final iterations containing only two cluster groups were considered, where the two groups implied the target and background.
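The sixteen names listed above correspond to every subset of at least three of the five variables Dolp, Ort, S0, S1, and S2 (ten triples, five quadruples, and the full set). A short Python sketch, with the variable names taken from the list and everything else assumed for illustration, regenerates them in the same order:

```python
from itertools import combinations

VARIABLES = ["Dolp", "Ort", "S0", "S1", "S2"]

# Every subset of three, four, or five variables, smallest subsets first.
input_sets = [
    "".join(subset)
    for size in (3, 4, 5)
    for subset in combinations(VARIABLES, size)
]

print(len(input_sets))
for name in input_sets:
    print(name)
```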


First, the mean values for the groups were statistically analyzed to see if the two groups were significantly different from each other. This comparison was completed using a simple hypothesis test for the difference of two means. The null hypothesis, $H_0: \mu_1 - \mu_2 = 0$, and the alternative hypothesis, $H_A: \mu_1 - \mu_2 \neq 0$, were evaluated, where $\mu_1$ and $\mu_2$ represent the means of the two groups. Thus, the null hypothesis states that there is no significant difference between the means, whereas the alternative hypothesis states that there is a significant difference between the means. This hypothesis was tested at both the $\alpha = 0.05$ and $\alpha = 0.001$ significance levels. A level of $\alpha = 0.05$ means that a true null hypothesis is wrongly rejected at most 5% of the time (95% confidence); $\alpha = 0.001$ is an even stronger statement, limiting the chance of a false rejection to 0.1% (99.9% confidence). Once the significance levels were chosen, the $t_{\text{critical}}$ values were calculated, where $n_1$ is the number of pixels in the large group and $n_2$ the number of pixels in the small group:


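$$t_{\text{critical}} = t\!\left(\frac{\alpha}{2},\, n_1 + n_2 - 2\right) = t(.025,\, n_1 + n_2 - 2) \quad \text{for } \alpha = 0.05$$

and

$$t_{\text{critical}} = t\!\left(\frac{\alpha}{2},\, n_1 + n_2 - 2\right) = t(.0005,\, n_1 + n_2 - 2) \quad \text{for } \alpha = 0.001,$$

and recorded from the t-distribution table. Next, the $t_{\text{sample}}$ values were calculated:

$$t_{\text{sample}} = \frac{(\bar{x}_1 - \bar{x}_2) - H_0}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}},$$

where $\bar{x}_i$ and $\sigma_i$ are the sample mean and standard deviation of group $i$. Thus, we can compare the values for $t_{\text{critical}}$ and $t_{\text{sample}}$. When $t_{\text{sample}} > t_{\text{critical}}$, the null hypothesis is rejected, implying that there is a significant difference between the means of the two groups. However, if $t_{\text{sample}} \le t_{\text{critical}}$, the null hypothesis is not rejected and no significant difference between the means of the two groups is established.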
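For group sizes in the hundreds, the t-distribution is very close to the standard normal distribution, so the tabulated critical values can be approximated by normal quantiles. The sketch below uses only the Python standard library; treating the t-distribution as normal is an assumption of this sketch (reasonable only for large degrees of freedom), and the helper name `critical_value` is illustrative.

```python
from statistics import NormalDist


def critical_value(alpha):
    """Two-sided critical value in the large-sample (normal) limit."""
    # alpha/2 in each tail, so look up the 1 - alpha/2 quantile.
    return NormalDist().inv_cdf(1 - alpha / 2)


for alpha in (0.05, 0.001):
    print(alpha, round(critical_value(alpha), 3))
```

For small groups the exact t-distribution quantile is noticeably larger than the normal one, so a t-table or a statistics package should be used instead.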


More specifically, here is an example of the hypothesis test using Dolp values from the DolpOrtS0 combination. From the data, $\bar{x}_1 = 0.095673$ is the mean of the first group, $\bar{x}_2 = 0.081788$ the mean of the second group, $n_1 = 943$ the size of the first group, $n_2 = 18$ the size of the second group, $\sigma_1 = 0.002203$ the standard deviation of the first group, and $\sigma_2 = 0.013361$ the standard deviation of the second group. The hypothesis test for $\alpha = 0.05$ progresses as follows:

$$H_0: \mu_1 - \mu_2 = 0$$

$$H_A: \mu_1 - \mu_2 \neq 0$$

$$t_{\text{critical}} = t\!\left(\frac{\alpha}{2},\, n_1 + n_2 - 2\right) = t(.025,\, 959) = 1.96$$

$$t_{\text{sample}} = \frac{(\bar{x}_1 - \bar{x}_2) - H_0}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} = \frac{(0.095673 - 0.081788) - 0}{\sqrt{\dfrac{0.002203^2}{943} + \dfrac{0.013361^2}{18}}} = 4.407$$

This gives a $t_{\text{critical}}$ of 1.96 and a $t_{\text{sample}}$ of 4.407. Since $4.407 > 1.96$, $t_{\text{sample}} > t_{\text{critical}}$ and the null hypothesis is rejected. Thus, at the 95% confidence level, the means of the two groups are significantly different. However, since this analysis has a 5% error, the 99.9% confidence level using $\alpha = 0.001$ is also evaluated as follows:

$$H_0: \mu_1 - \mu_2 = 0$$

$$t_{\text{critical}} = t\!\left(\frac{\alpha}{2},\, n_1 + n_2 - 2\right) = t(.0005,\, 959) = 3.291$$

$$t_{\text{sample}} = \frac{(\bar{x}_1 - \bar{x}_2) - H_0}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} = \frac{(0.095673 - 0.081788) - 0}{\sqrt{\dfrac{0.002203^2}{943} + \dfrac{0.013361^2}{18}}} = 4.407$$

This gives a $t_{\text{sample}}$ of 4.407 and a $t_{\text{critical}}$ of 3.291, where $4.407 > 3.291$ and thus $t_{\text{sample}} > t_{\text{critical}}$, so the null hypothesis is rejected, and there is a significant difference in the means at the 99.9% confidence level.
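The worked example can be checked with a few lines of Python. The function name `two_sample_t` is illustrative, and the formula is the unpooled-variance statistic given earlier in this section. With the rounded inputs quoted in the example, the statistic evaluates to roughly 4.408, which agrees with the reported 4.407 to within rounding of the inputs.

```python
from math import sqrt


def two_sample_t(mean1, mean2, sd1, sd2, n1, n2, h0=0.0):
    """t statistic for a difference of two means with unpooled variances."""
    standard_error = sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return ((mean1 - mean2) - h0) / standard_error


# Dolp values from the DolpOrtS0 combination, as quoted in the example.
t_sample = two_sample_t(0.095673, 0.081788, 0.002203, 0.013361, 943, 18)

for t_critical in (1.96, 3.291):
    reject = abs(t_sample) > t_critical  # two-sided decision rule
    print(f"t_sample={t_sample:.3f}  t_critical={t_critical}  reject H0: {reject}")
```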


In addition to comparing the means of the two groups, the proportion of pixels in the smaller group was tested to see whether it was significantly different from zero. A hypothesis test for significance was used for this as well, using the proportion calculated as

$$\pi = \frac{\#\,\text{PixelsSmallGroup}}{\#\,\text{PixelsTotal}}.$$

The hypotheses tested are $H_0: \pi = 0$ and $H_A: \pi \neq 0$. The null hypothesis states that $\pi$ is the same as zero, whereas the alternative hypothesis states that $\pi$ is significantly different from zero. Once again, this was tested at both the 95% and 99.9% confidence levels, using $\alpha = 0.05$ and $\alpha = 0.001$. The critical value is calculated as such:

$$t_{\text{critical}} = t\!\left(\frac{\alpha}{2},\, n - 1\right) = t(.025,\, n - 1) \quad \text{for } \alpha = 0.05$$

and

$$t_{\text{critical}} = t\!\left(\frac{\alpha}{2},\, n - 1\right) = t(.0005,\, n - 1) \quad \text{for } \alpha = 0.001,$$

and the t-sample is calculated:

$$t_{\text{sample}} = \frac{P - H_0}{\sqrt{\dfrac{P(1 - P)}{n - 1}}},$$

where $P$ is the estimated proportion of pixels in the small group and $n$ is the total number of pixels. If $t_{\text{sample}} > t_{\text{critical}}$, the null hypothesis is rejected, stating that the proportion of pixels in the smaller group is significantly different from zero; however, if $t_{\text{sample}} \le t_{\text{critical}}$, the null hypothesis is not rejected and the proportion is not significantly different from zero.

For an example from the analysis, consider $P = 0.018730489$, the estimated proportion of pixels in the small group from the DolpOrtS0 combination. The hypothesis test for $\alpha = 0.05$ follows:

$$H_0: \pi = 0$$

$$H_A: \pi \neq 0$$

$$t_{\text{critical}} = t\!\left(\frac{\alpha}{2},\, n - 1\right) = t(.025,\, 960) = 1.96$$

$$t_{\text{sample}} = \frac{P - H_0}{\sqrt{\dfrac{P(1 - P)}{960}}} = \frac{0.018730489 - 0}{\sqrt{\dfrac{0.018730489\,(1 - 0.018730489)}{960}}} = 4.283$$

This gives $t_{\text{critical}} = 1.96$ and $t_{\text{sample}} = 4.283$; therefore $4.283 > 1.96$ and $t_{\text{sample}} > t_{\text{critical}}$, so the null hypothesis is rejected. Thus, at the 95% confidence level, the proportion of pixels in the small group for DolpOrtS0 is significantly different from zero. To strengthen this conclusion, look at the hypothesis test for $\alpha = 0.001$:

$$H_0: \pi = 0$$

$$H_A: \pi \neq 0$$

$$t_{\text{critical}} = t\!\left(\frac{\alpha}{2},\, n - 1\right) = t(.0005,\, 960) = 3.291$$

$$t_{\text{sample}} = \frac{P - H_0}{\sqrt{\dfrac{P(1 - P)}{960}}} = \frac{0.018730489 - 0}{\sqrt{\dfrac{0.018730489\,(1 - 0.018730489)}{960}}} = 4.283$$

This gives $t_{\text{critical}} = 3.291$ and $t_{\text{sample}} = 4.283$; $4.283 > 3.291$ and $t_{\text{sample}} > t_{\text{critical}}$, thus the null hypothesis is rejected and, at the 99.9% confidence level, the proportion remains significantly different from zero.
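The proportion example can be reproduced with the short Python sketch below. The helper name `proportion_t` and its `denom` parameter are assumptions made for illustration, and the proportion is taken as 18/961, which equals the quoted 0.018730489 given the group sizes 943 and 18 from the earlier example. Evaluating the statistic with both 960 and 961 under the square root shows that the reported 4.283 corresponds to a denominator of 961, while 960 gives about 4.281; the conclusion of the test is the same either way.

```python
from math import sqrt


def proportion_t(p, denom, h0=0.0):
    """t statistic for a proportion, with standard error sqrt(p(1-p)/denom)."""
    return (p - h0) / sqrt(p * (1 - p) / denom)


# Small-group pixels over total pixels for DolpOrtS0 (18 of 961).
p_small = 18 / 961

for denom in (960, 961):
    t_sample = proportion_t(p_small, denom)
    print(denom, round(t_sample, 3), t_sample > 1.96, t_sample > 3.291)
```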
3.  Results and Discussion


This investigation focuses on the outputs from the cluster algorithm analysis of one image subsection run with all possible variable input combinations. More specifically, the investigation uses the data from the two-group clusters output by the cluster algorithm after it was run on the visible image subsection shown in figure 1. As seen in this image, two large tanks reside in the center, one closer to the top and the other closer to the bottom.


Figure  1.  Visible  image  subsection. 

The outputs from the algorithm included pictures showing which group the pixels ended up in based on the principal component analysis. Figure 2 shows the resulting groups from the image based on the DolpOrtS0 combination of variables. The red pixels represent the location of the object and the blue pixels show the background. Notice that the main clump of red pixels lies in the center, approximately where the lower vehicle appears in figure 1. The algorithm determined at the two-cluster level that the lower vehicle differs more from the background than the upper vehicle does.
Figure 2.  DolpOrtS0 cluster plot for two clusters.

Figure 3 shows the resulting two groups based on the DolpOrtS0S1S2 combination of variables. This image highlights a different area of the image than the previous image does, demonstrating the necessity of the investigation. While this report only shows a sample of the images, it is evident from the entire set that every combination of variables highlights a slightly different portion. Therefore, it becomes necessary to look at the different combinations prior to establishing a preference.

Figure 3.  DolpOrtS0S1S2 cluster plot for two clusters.

The algorithm produced many outputs, ranging from images like those above to statistics on the number of pixels, means, and standard deviations of the clusters. From these data it is apparent that the different combinations place a different number of pixels in the smaller group. This number ranges from 2 to 51, with OrtS0S1 leading and DolpS1S2, DolpOrtS1S2, and DolpS0S1S2 following furthest behind. The mean number of pixels in the small group was 14.7, the median 8, and the modes 2, 6, and 8. Figure 4 shows the distribution of pixels between groups 1 and 2 across all combinations of variables.


Figure  4.  Distribution  of  pixels  between  groups  1  and  2. 
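The summary statistics quoted above can be recovered from the rounded proportions listed in Table 1. The sketch below assumes a total of 961 pixels in the image subsection (943 + 18, the group sizes from the worked example in section 2) and converts each proportion back to a whole number of pixels; the variable names are illustrative.

```python
from statistics import mean, median, multimode

TOTAL_PIXELS = 961  # assumption: 943 + 18 pixels, from the worked example

# Proportions as printed in Table 1, in table order.
proportions = [0.0187, 0.0083, 0.0166, 0.0353, 0.0083, 0.0020, 0.0530, 0.0062,
               0.0291, 0.0062, 0.0301, 0.0020, 0.0114, 0.0020, 0.0062, 0.0083]

# Convert each rounded proportion back to a whole number of pixels.
small_group_sizes = [round(p * TOTAL_PIXELS) for p in proportions]

print("range: ", min(small_group_sizes), "to", max(small_group_sizes))
print("mean:  ", round(mean(small_group_sizes), 1))
print("median:", median(small_group_sizes))
print("modes: ", sorted(multimode(small_group_sizes)))
```

Under that assumption the sizes reproduce the reported range of 2 to 51, the mean of 14.7, the median of 8, and the three modes 2, 6, and 8.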

The proportions of pixels in the small group were compared to zero, as explained in the previous section, to tell whether there were a significant number of pixels in the group. At the 95% confidence level, 14/16 proportions were significantly greater than zero. However, at the 99.9% confidence level, only 7/16 proportions remain significantly higher than zero. The more significantly different from zero the proportion is, the more likely it is that the analysis highlighted an object amongst the natural background. Table 1 shows the specific results from the hypothesis testing. Listed next to each combination of variables are its proportion and either a yes or a no for the 95% and 99.9% confidence levels, indicating whether the proportion is significant. The yellow highlighted "No" boxes indicate an insignificant proportion, and the white "Yes" boxes indicate a significant proportion.
Table 1.  Results of the proportion hypothesis testing.

Combination      Proportion   95%   99.9%
DolpOrtS0        0.0187       Yes   Yes
DolpOrtS1        0.0083       Yes   No
DolpOrtS2        0.0166       Yes   Yes
DolpS0S1         0.0353       Yes   Yes
DolpS0S2         0.0083       Yes   No
DolpS1S2         0.0020       No    No
OrtS0S1          0.0530       Yes   Yes
OrtS0S2          0.0062       Yes   No
OrtS1S2          0.0291       Yes   Yes
S0S1S2           0.0062       Yes   No
DolpOrtS0S1      0.0301       Yes   Yes
DolpOrtS0S2      0.0020       No    No
DolpOrtS1S2      0.0114       Yes   Yes
DolpS0S1S2       0.0020       Yes   No
OrtS0S1S2        0.0062       Yes   No
DolpOrtS0S1S2    0.0083       Yes   No

This table displays the results from the proportion analysis. The first column, combination, refers to the combination of input values. The second column, proportion, gives the number of pixels in the small cluster over the total number of pixels. The third and fourth columns display the results of the hypothesis test at the 95% and 99.9% confidence levels, respectively.
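As a rough cross-check, the decisions in Table 1 can be recomputed from the printed proportions. The sketch below is not the original analysis: it assumes 961 total pixels (943 + 18, from the worked example), uses the rounded proportions as printed, and applies the critical values 1.96 and 3.291 from section 2; all names are illustrative.

```python
from math import sqrt

TOTAL_PIXELS = 961  # assumption: 943 + 18 pixels, as in the worked example
T_CRITICAL = {"95%": 1.96, "99.9%": 3.291}

# Rounded proportions as printed in Table 1.
table1 = {
    "DolpOrtS0": 0.0187, "DolpOrtS1": 0.0083, "DolpOrtS2": 0.0166,
    "DolpS0S1": 0.0353, "DolpS0S2": 0.0083, "DolpS1S2": 0.0020,
    "OrtS0S1": 0.0530, "OrtS0S2": 0.0062, "OrtS1S2": 0.0291,
    "S0S1S2": 0.0062, "DolpOrtS0S1": 0.0301, "DolpOrtS0S2": 0.0020,
    "DolpOrtS1S2": 0.0114, "DolpS0S1S2": 0.0020, "OrtS0S1S2": 0.0062,
    "DolpOrtS0S1S2": 0.0083,
}


def proportion_t(p, n):
    """t statistic for testing a proportion against zero."""
    return p / sqrt(p * (1 - p) / n)


# Names of the combinations whose statistic exceeds each critical value.
significant = {
    level: [name for name, p in table1.items()
            if proportion_t(p, TOTAL_PIXELS) > t_crit]
    for level, t_crit in T_CRITICAL.items()
}

for level, names in significant.items():
    print(level, len(names), "of", len(table1))
```

Under these assumptions the sketch reproduces the seven significant rows at the 99.9% level. At the 95% level it finds all three rows with a proportion of 0.0020 below the critical value, whereas Table 1 as printed marks DolpS0S1S2 as "Yes", so that row may deserve a second look against the original data.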


The mean values of the two groups for each combination of variables were compared to see if there was a significant difference, as explained previously. As expected, it was found that the majority of the clusters had significantly different values for the mean of each group. Table 2 shows the specific results of the hypothesis testing. Listed next to each combination of variables are the three to five variables it comprises, and the results of the hypothesis test for a difference of mean values for each parameter at the 95% and 99.9% confidence levels. At the 95% confidence level, 9/55 of the mean values were not significantly different, and 21/55 were not significantly different at the 99.9% confidence level. The insignificant differences are marked with a yellow highlighted box and a "No", and the significant differences are marked with a white "Yes" box.
Table 2.  Results of the mean values of clusters comparison hypothesis testing.

Combination      Parameter   95%         99.9%
DolpOrtS0        Dolp        Yes         Yes
                 Ort         Yes         Yes
                 S0          Yes         Yes
DolpOrtS1        Dolp        No          No
                 Ort         Yes         Yes
                 S1          Yes         Yes
DolpOrtS2        Dolp        Yes         Yes
                 Ort         Yes         Yes
                 S2          Yes         Yes
DolpS0S1         Dolp        Yes         No
                 S0          No          No
                 S1          Yes         Yes
DolpS0S2         Dolp        Yes         Yes
                 S0          Yes         No
                 S2          Yes         Yes
DolpS1S2         Dolp        Yes         Yes
                 S1          Yes         Yes
                 S2          No          No
OrtS0S1          Ort         Yes         Yes
                 S0          Yes         Yes
                 S1          Yes         Yes
OrtS0S2          Ort         Yes         Yes
                 S0          Yes         No
                 S2          Yes         No
OrtS1S2          Ort         Yes         Yes
                 S1          Yes         Yes
                 S2          Yes         No
S0S1S2           S0          Yes         No
                 S1          Yes         Yes
                 S2          Yes         No
DolpOrtS0S1      Dolp        Yes         Yes
                 Ort         Yes         Yes
                 S0          No          No
                 S1          Yes         Yes
DolpOrtS0S2      Dolp        Yes         Yes
                 Ort         Yes         Yes
                 S0          No          No
                 S2          No          No
DolpOrtS1S2      Dolp        Yes         No
                 Ort         Yes         Yes
                 S1          Yes         Yes
                 S2          Yes         Yes
DolpS0S1S2       Dolp        [missing]   [missing]
                 S0          No          No
                 S1          Yes         Yes
                 S2          No          No
OrtS0S1S2        Ort         Yes         Yes
                 S0          Yes         No
                 S1          Yes         Yes
                 S2          Yes         No
DolpOrtS0S1S2    Dolp        No          No
                 Ort         Yes         Yes
                 S0          Yes         No
                 S1          Yes         Yes
                 S2          Yes         No

This table summarizes the results from the hypothesis tests on whether or not there was a significant difference between the means of the two clusters. The first column, combination, specifies the combination of input variables. The second column, parameter, specifies the variable compared. Finally, the third and fourth columns display the results of the hypothesis test at the 95% and 99.9% confidence levels, respectively.


4.  Summary  and  Conclusions 


This  investigation  explored  the  output  of  a  cluster  analysis  algorithm  compared  over  all  possible 
combinations  of  input  parameters.  Due  to  the  difference  in  parameters,  it  was  expected  that  the 
output  would  vary  based  on  the  combination.  As  seen  in  figure  3,  the  algorithm  groups  the 
pixels  very  differently  across  all  combinations.  In  addition,  it  was  expected  that  the  differences 
in  the  means  between  the  two  groups  would  be  significantly  different  because  these  groups  were 
determined  based  on  the  individual  pixels.  It  was  found  that  the  majority  of  the  groups  had 
significantly  different  mean  values  at  the  95%  confidence  level,  and  of  this  majority,  most 
remained  significant  at  the  99.9%  confidence  level.  Based  on  the  variety  of  output  as  seen  in 
figure  3,  it  was  to  be  expected  that  some  of  the  small  groups  would  not  contain  a  significant 
proportion  of  pixels  compared  to  zero.  It  was  found  that  14/16  of  the  data  sets  had  a  significant 
proportion  of  pixels  in  the  smaller,  perceived  object  cluster  group  at  the  95%  confidence  level, 
with  7/14  remaining  significant  at  the  99.9%  confidence  level.  The  next  step  is  to  analyze  even 
more  specific  statistics  from  these  combinations  and  to  repeat  this  analysis  using  more  sample 
images  with  different  backgrounds,  targets,  and  weather  conditions. 